Filtering Offensive Language in Online Communities using Grammatical Relations

Authors

  • Zhi Xu
  • Sencun Zhu
Abstract

Offensive language has become a serious problem for the health of both online communities and their users. For an online community, the spread of offensive language undermines its reputation, drives users away, and can even directly affect its growth. For users, viewing offensive language harms their mental health, especially in the case of children and youth. When offensive language is detected in a user message, the question arises of how it should be removed, i.e., the offensive language filtering problem. Manual filtering is known to produce the best filtering results, but it is costly in time and labor and thus cannot be widely applied. In this paper, we analyze the offensive language in text messages posted in online communities and propose a new automatic sentence-level filtering approach that semantically removes the offensive language by utilizing the grammatical relations among words. Compared with existing automatic filtering approaches, the proposed approach produces results much closer to those of manual filtering. To demonstrate our work, we created a dataset by manually filtering over 11,000 text comments from the YouTube website. Experiments on this dataset show over 90% agreement in filtered results between the proposed approach and manual filtering. Moreover, we show that the overhead of applying the proposed approach to user comment filtering is reasonable, making it practical to adopt in real-life applications.
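The abstract does not spell out the filtering rules, so the following is only a minimal sketch of the general idea of using grammatical relations to decide what to remove. It is not the authors' algorithm. It assumes the dependency edges are already available (here they are written by hand; a real system would obtain them from a parser), and the lexicon `OFFENSIVE`, the relation set `MODIFIER_RELATIONS`, and the function names are all hypothetical. An offensive word that is an optional modifier is deleted together with its dependents; an offensive word in a structural role is masked instead, so the sentence stays grammatical.

```python
from collections import defaultdict

# Hypothetical lexicon and relation labels, chosen for illustration only.
OFFENSIVE = {"stupid", "idiot"}
MODIFIER_RELATIONS = {"amod", "advmod"}


def subtree(head, children):
    """Return the index of `head` plus all of its transitive dependents."""
    found, stack = set(), [head]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(children[node])
    return found


def filter_sentence(tokens, edges):
    """Filter one sentence.

    tokens: list of words.
    edges:  list of (head_index, dependent_index, relation) triples.
    """
    children = defaultdict(list)
    relation_of = {}
    for head, dependent, relation in edges:
        children[head].append(dependent)
        relation_of[dependent] = relation

    drop, mask = set(), set()
    for i, token in enumerate(tokens):
        if token.lower() not in OFFENSIVE:
            continue
        if relation_of.get(i) in MODIFIER_RELATIONS:
            # Optional modifier: remove it along with everything it governs.
            drop |= subtree(i, children)
        else:
            # Structural role (e.g. the root): mask rather than delete.
            mask.add(i)

    kept = []
    for i, token in enumerate(tokens):
        if i in drop:
            continue
        kept.append("***" if i in mask else token)
    return " ".join(kept)


# "really" depends on "stupid", so both are removed together.
print(filter_sentence(
    ["that", "really", "stupid", "idea"],
    [(3, 0, "det"), (3, 2, "amod"), (2, 1, "advmod")],
))  # that idea

# "idiot" is the root of the sentence, so it is masked, not deleted.
print(filter_sentence(
    ["You", "are", "an", "idiot"],
    [(3, 0, "nsubj"), (3, 1, "cop"), (3, 2, "det")],
))  # You are an ***
```

A plain keyword filter would leave "that really idea" in the first case; following the dependency edges is what lets the dangling adverb be removed as well.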

Similar resources

Using Semantic Relations with World Knowledge for Question Answering

Two research directions are to be explored in realizing our group’s TREC QA system in 2006. The first one is to investigate the possibilities of applying a linguistically sophisticated grammatical framework to real-world natural language processing tasks such as question answering. The other is to exploit the possible world’s entities and relations as described in online encyclopedia i...

Scaffolding Moves by Learners in Online Interactions

Learners can collaborate with each other to achieve a lesson objective. In the collaboration, they can provide each other with guidance in order to identify mistakes and improve their achievements. With the rise of online instructions, this small-scale exploratory study aimed to see how proficient learners guided their less proficient classmates in correcting the grammatical accuracy of sentenc...

Dealing with Internet Trolling in Political Online Communities: Towards the This Is Why We Can't Have Nice Things Scale

Internet trolling has become a popularly used term to describe the posting of any content on the Internet which is provocative or offensive. This is different from the original meaning online in the 1990s, which referred to the posting of provocative messages for humorous effect. Those systems operators (sysops) who run online communities are finding they are being targeted because of abuse po...

The Representation of Social Actors in the Graduate Employability Issue: Online News and the Government Document

This paper presents the first part of a larger study on the issue of graduate employability in Malaysia as construed in public discourse in English, a language of power in Malaysia. The term employability itself has many definitions depending on the requirements of government and industry, and in the case of Malaysia, the English-language ability of graduates is inseparable from graduate employ...

Publication date: 2010